C2P: Clustering based on Closest Pairs

Authors

  • Alexandros Nanopoulos
  • Yannis Theodoridis
  • Yannis Manolopoulos
Abstract

In this paper we present C2P, a new clustering algorithm for large spatial databases that exploits spatial access methods to determine closest pairs. Several extensions are presented for scalable clustering in large databases that contain outliers and clusters of various shapes. Due to its characteristics, the proposed algorithm attains the advantages of both hierarchical clustering and graph-theoretic algorithms, providing both efficiency and quality of the clustering result. The superiority of C2P is verified with both analytical and experimental results.
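The general idea of clustering by closest pairs can be illustrated with a minimal sketch. This is not the C2P algorithm from the paper: it is a brute-force, single-link style procedure that repeatedly merges the two clusters containing the closest remaining pair of points until `k` clusters are left. The function name `closest_pair_clusters`, its parameters, and the use of a sorted list of all pairs in place of a spatial access method are assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def closest_pair_clusters(points, k):
    """Merge clusters along closest pairs until k clusters remain.

    Illustrative only: all O(n^2) pairs are sorted up front, whereas a
    spatial access method would retrieve closest pairs incrementally.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every pair of point indices, ordered by Euclidean distance.
    pairs = sorted(
        combinations(range(len(points)), 2),
        key=lambda ab: dist(points[ab[0]], points[ab[1]]),
    )

    remaining = len(points)
    for a, b in pairs:
        if remaining <= k:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # The closest pair spans two clusters, so merge them.
            parent[root_b] = root_a
            remaining -= 1

    # Collect the points of each cluster under its root index.
    groups = {}
    for i, point in enumerate(points):
        groups.setdefault(find(i), []).append(point)
    return list(groups.values())
```

Calling `closest_pair_clusters([(0, 0), (0, 1), (10, 10), (10, 11)], 2)` groups the two nearby pairs into separate clusters. The quadratic pair enumeration is what makes this sketch unsuitable for large databases, which is the cost the abstract says spatial access methods are used to avoid.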


Similar resources

Similarity Indexing for Data Mining Applications

Efficient and scalable algorithms and access methods that solve closest point problems are critical to many data mining applications. Data mining algorithms often need to find close or similar objects or pairs of similar objects. Some notable examples are the following: Time Series and Sequence Databases: In this application, the objects are typically subsequences within a time series sequence....


C2P: Co-operative Caching in Distributed Storage Systems

Distributed storage systems (e.g. the clustered filesystems HDFS and GPFS, and object stores such as OpenStack Swift) often partition sequential data across storage systems for performance (data striping) or protection (erasure coding). This partitioning leads to logically correlated data being stored on different physical storage devices, which operate autonomously. This un-coordinated operation may lead t...


Recurrence risk score based on the specific activity of CDK1 and CDK2 predicts response to neoadjuvant paclitaxel followed by 5-fluorouracil, epirubicin and cyclophosphamide in breast cancers.

BACKGROUND: We established the cell cycle profiling (C2P) assay for specific activity (SA; activity/expression) of cyclin-dependent kinases (CDKs). C2P risk score (C2P-RS) based on CDK1 and CDK2 SAs was significantly associated with relapse in breast cancer (BC). This study was conducted to investigate the predictive value of C2P-RS for neoadjuvant chemotherapy (NAC). PATIENTS AND METHODS: Amon...


A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...


Hiding Data in VQ-compressed Images Using Dissimilar Pairs

Steganography is the art and science of embedding secret data in another medium to prevent the leakage of secret information. A VQ-based (vector quantization) steganographic method usually involves changes of the block values in the VQ images, which might cause serious distortion. As a result, many existing methods use closest pairs or clustering techniques to preserve an acceptable image quali...




Publication date: 2001